Optimizing Recursive Information-Gathering Plans
Abstract
In this paper we describe two optimization techniques tailored specifically to information gathering. The first is a greedy minimization algorithm that removes redundant and overlapping information sources from an information gathering plan without loss of completeness. We then discuss a set of heuristics that guide this algorithm so that costlier information sources are removed first. In contrast to previous work, our approach can handle the recursive query plans that commonly arise in practice. Second, we present a method for ordering accesses to sources so as to reduce execution cost. Sources on the Internet have a variety of access limitations, and the execution cost of information gathering is affected by both network traffic and connection setup costs. We describe a way of representing the access capabilities of sources and provide a greedy algorithm for ordering source calls that respects source limitations and takes both access costs and traffic costs into account, without requiring full source statistics. Finally, we discuss the implementation and empirical evaluation of these methods in Emerac, our prototype information gathering system.
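To illustrate the second idea, here is a minimal Python sketch of greedy source-call ordering under access limitations. It is not Emerac's algorithm, which the abstract does not spell out. The `Source` fields, the cost formula (`setup_cost + traffic_weight * est_tuples`), and the example sources are all hypothetical assumptions chosen for illustration. The sketch shows only the general shape: at each step, call the cheapest source whose required input bindings are already available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    # All fields are illustrative assumptions, not taken from the paper.
    name: str
    requires: frozenset   # variables that must be bound before the call
    provides: frozenset   # variables that are bound after the call
    setup_cost: float     # assumed per-connection cost
    est_tuples: float     # assumed coarse estimate of tuples transferred


def greedy_order(sources, bound=frozenset(), traffic_weight=1.0):
    """Return source names in a greedy, feasibility-respecting call order."""
    bound = set(bound)
    remaining = list(sources)
    order = []
    while remaining:
        # Only sources whose binding requirements are met may be called now.
        feasible = [s for s in remaining if s.requires <= bound]
        if not feasible:
            raise ValueError("no feasible source call: unmet binding requirements")
        # Pick the cheapest feasible call under the assumed cost estimate.
        best = min(
            feasible,
            key=lambda s: s.setup_cost + traffic_weight * s.est_tuples,
        )
        order.append(best.name)
        bound |= best.provides
        remaining.remove(best)
    return order


# Hypothetical sources: "catalog" needs no inputs; the others need a title.
example_sources = [
    Source("reviews", frozenset({"title"}), frozenset({"title", "review"}), 5.0, 50.0),
    Source("catalog", frozenset(), frozenset({"title", "year"}), 2.0, 200.0),
    Source("showtimes", frozenset({"title"}), frozenset({"title", "time"}), 1.0, 20.0),
]

print(greedy_order(example_sources))
```

In this example "catalog" must go first even though it is the most expensive call, because it is the only source that can be queried without a bound title. The remaining calls are then taken in order of estimated cost.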
Related works
Recursive Plans for Information Gathering
Generating query-answering plans for information gathering agents requires translating a user query, formulated in terms of a set of virtual relations, into a query that uses relations actually stored in information sources. Previous solutions to the translation problem produced sets of conjunctive plans, and were therefore limited in their ability to handle information sources with bin...
Optimizing source-call ordering in Information Gathering Plans
In this paper we consider the problem of optimizing the order in which source relations are joined in information gathering plans. This problem differs significantly from the traditional database query optimization problem, as sources on the Internet have a variety of access limitations and the execution cost in information gathering is affected both by network traffic and by the connection set...
Efficiently Executing Information Gathering Plans
The most costly aspect of gathering information over the Internet is that of transferring data over the network to answer the user’s query. We make two contributions in this paper that alleviate this problem. First, we present an algorithm for reducing the number of information sources in an information gathering (IG) plan by reasoning with localized closed world (LCW) statements. In contrast t...
Building a Planner for Information Gathering: A Report from the Trenches
Information gathering requires locating and integrating data from a set of distributed information sources. These sources may contain overlapping data and can come from different types of sources, including traditional databases, knowledge bases, programs, and Web pages. In this paper we focus on the problem of how to apply a general-purpose planner to produce plans for information gathering. W...
Publication date: 1999